Abstract

Cinnamyl alcohol dehydrogenase (CAD) catalyses the final step of monolignol biosynthesis, the conversion of
cinnamyl aldehydes to alcohols, using NADPH as a cofactor. Seven members of the CAD gene family were identified
in the genome of Brachypodium distachyon and five of these were isolated and cloned from genomic DNA.
Semi-quantitative reverse-transcription PCR revealed differential expression of the cloned genes, with BdCAD5 being
expressed in all tissues and highest in root and stem while BdCAD3 was only expressed in stem and spikes. A
phylogenetic analysis of CAD-like proteins placed BdCAD5 on the same branch as bona fide CAD proteins from maize
(ZmCAD2), rice (OsCAD2), sorghum (SbCAD2) and Arabidopsis (AtCAD4, 5). The predicted three-dimensional
structures of both BdCAD3 and BdCAD5 resemble that of AtCAD5. However, the amino-acid residues in the
substrate-binding domains of BdCAD3 and BdCAD5 are distributed symmetrically and that of BdCAD3 is similar to that
of poplar sinapyl alcohol dehydrogenase (PotSAD). BdCAD3 and BdCAD5 expressed and purified from Escherichia coli
both showed a temperature optimum of about 50 °C and a molar weight of 49 kDa. The optimal pH values for the
reduction of coniferyl aldehyde were 5.2 and 6.2 and those for the oxidation of coniferyl alcohol were 8 and 9.5, for
BdCAD3 and BdCAD5 respectively. Kinetic parameters for conversion of coniferyl aldehyde and coniferyl alcohol showed
that BdCAD5 was clearly the more efficient of the two enzymes. These data suggest that BdCAD5 is the main CAD enzyme
for lignin biosynthesis and that BdCAD3 has a different role in Brachypodium. All CAD enzymes are cytosolic except
for BdCAD4, which has a putative chloroplast signal peptide, adding to the diversity of CAD functions.

Key words: Brachypodium distachyon (Bd21-3), coniferyl aldehyde, cinnamyl alcohol dehydrogenase (CAD), gene structure,
lignocellulose, recalcitrance, signal peptide.



Introduction 

Utilization of lignocellulosic plant material for biofuel produc- 
tion has regained importance in society. In order to be an eco- 
nomically viable solution to biofuel production, it is necessary to 
develop strategies to overcome the recalcitrance of lignin, which 
is a limiting factor in the degradation of cellulose into sugars. The 
current understanding of lignin biosynthesis has been obtained 
from research in different areas. Improved yields in the pulp and 
paper industry promoted research in lignocellulose in woody species
and, during the 1980s, the phenylalanine pathway providing
the monolignols, the building blocks of lignin, was studied as
an important part of the plant defence towards pathogens. Thus, 
manipulation of the lignin biosynthesis pathway has been pro- 
posed as a possible solution to reduce recalcitrance. Brown mid- 
rib mutants in maize were identified in the 1920s (Jorgenson, 
1931) but it was much later that their potential for improving 
digestibility was realized. Existing mutants in maize and sor- 
ghum, known as brown midrib mutants, with altered lignin bio- 
synthesis, have been shown to have increased digestibility (for 
review see Barriere et al., 2004; Guillaumie et al., 2007; Sattler
et al., 2010). The brown midrib phenotype has been linked to
mutations in a cinnamyl alcohol dehydrogenase (CAD) gene in 
bmr6 sorghum (Sattler et al., 2010). In maize, a link between the 
expression of CAD genes and the brown midrib phenotype was 
found, but as no CAD mutants have been identified it is speculated
that the mutation resides in a transcription factor (Guillaumie
et al., 2007). The CAD-like gene family has been investigated
in a number of plant species such as sorghum (Saballos et al., 
2009; Sattler et al., 2009), rice (Tobias and Chow, 2005; Li 
et al, 2009), switchgrass (Fu et al, 2011; Saathoff et al, 2011), 
poplar (Lapierre et al, 2000, 2004; Barakat et al, 2009, 2010), 
pine (MacKay et al, 1995), white spruce (Bedon et al, 2009), 
Arabidopsis (Kim et al, 2004, 2007; Sibout et al, 2003, 2005), 
tobacco (Halpin et al, 1994; Damiani et al, 2005), sweet potato 
(Kim et al, 2010), wheat (Ma, 2010), and maize (Halpin et al, 
1998; Marita et al, 2003; Guillaumie et al, 2007). 

The biosynthetic pathway of lignin monomers and the genes 
involved therein have been thoroughly reviewed (Dixon et al, 
2001; Boerjan et al, 2003; Vanholme et al, 2008). Lignin is a 
complex heteropolymer formed by coupling of different monolignol
subunits, of which p-coumaryl alcohol (H), coniferyl alcohol
(G), and sinapyl alcohol (S) are the three most frequent (Boerjan
et al., 2003). Lignin content and composition vary between plant
species, between different tissues as well as between the differ- 
ent layers of the plant cell wall (Campbell and Sederoff, 1996; 
Hisano et al., 2009). The lignin of most dicotyledonous angiosperms
consists mainly of G- and S-monolignols, whereas monocots
also contain H-monolignol (Dixon et al., 2001; Bonawitz
and Chapple, 2010). Ten enzymes are believed to be required
for the biosynthesis of the three monolignols, with phenylalanine 
ammonia lyase as the initial enzyme of the pathway and CAD 
catalysing the last step of the reduction of the three aldehydes to 
alcohols (Li et al, 2008; Hisano et al, 2009). 

Modification of the lignin concentration and/or composition, 
in order to increase digestibility and saccharification efficiency, 
has been frequently attempted by manipulating the expression of 
the different genes in the lignin biosynthesis pathway (for recent
reviews see Li et al., 2008; Vanholme et al., 2008). These genes
provide the monomeric building blocks and they are the primary 
target in many research projects aiming at reducing the recalci- 
trance of lignocellulosic material. This is often referred to as the 
low-hanging fruit strategy. The most pronounced effect on lignin 
content is obtained by suppressing genes in the early steps of 
the pathway, whereas reducing CAD activity rather affects lignin 
composition according to Kim et al. (2002) and Li et al. (2008). 
Decreasing the lignin content in order to improve digestibility 
can result in plants with impaired growth (Chabannes et al, 
2001; Chen and Dixon, 2007). However, it appears that manipu- 
lating enzymes in the later steps of the pathway, e.g. CAD genes, 
has less, if any, effect on the plant biomass production (Bonawitz
and Chapple, 2010).

CAD catalyses the final step of the monolignol biosynthesis, 
the conversion of cinnamyl aldehydes to alcohols, using NADPH 
as a cofactor (Sattler et al., 2010). Early reports on the identification
of CAD enzymes suggested that only one form exists in most
plant species. However, in Eucalyptus gunnii two CAD isoforms
were isolated and named EgCAD1 and EgCAD2 (Goffner et al.,
1992) and recent studies have indicated that angiosperms contain
a family of CAD-like genes, with nine members in Arabidopsis
(Kim et al., 2004), 14 members in sorghum (Saballos et al., 2009),
and 12 members in rice (Tobias and Chow, 2005). However, when
comparing biochemical properties of EgCAD1 and EgCAD2,
EgCAD2 was considered to be the form most likely to be involved
in lignification (Goffner et al., 1992) and is thus regarded to be a
bona fide CAD. Furthermore, the sequence of EgCAD1 (accession
CAA61275) isolated from Eucalyptus is not conserved in amino-acid
residues which are believed to be essential and characteristic
for CAD function, e.g. in the zinc-binding domains (Youn et al.,
2006). Based on its amino-acid sequence, EgCAD1 is actually
more similar to the cinnamoyl-CoA reductase family acting in the
middle of the lignin biosynthetic pathway (Goffner et al., 1998).
In the early stage of identifying CAD-like genes, the sequences
ELI3-1 (AtCAD7) and ELI3-2 (AtCAD8) from Arabidopsis were
identified but not biochemically characterized. However, an ELI3
homologue from celery was isolated and described as a mannitol
dehydrogenase (Stoop and Pharr, 1992). The annotation was later
changed to benzyl alcohol dehydrogenase showing low catalytic
activity against monolignol compounds according to biochemical
analysis (Somssich et al., 1996). The initial mannitol dehydrogenase
annotation has thus resulted in misleading annotations in
databases because these were not substantiated by reliable biochemical
experiments on homologues of CAD-like enzymes (Kim
et al., 2007). Proteins with both proven mannitol dehydrogenase
activity and known protein structure are short-length dehydrogenases
of 265-280 residues and lack Zn-binding sites (see Nüss
et al., 2010). CAD-like enzymes are typically mid-length dehydrogenases
of 350 amino acids. This illustrates the need for biochemical
characterization of enzymes encoded by cloned genes in order to
confirm their putative function (Kim et al., 2004).

In this study, seven CAD-like genes were identified and five were
isolated from Brachypodium distachyon, the model plant for temperate
grasses. The relative expression of these genes in different plant
organs as well as in vitro biochemical characterization of the
enzymes BdCAD3 and BdCAD5 are presented. As a first step
to fully assign functions to the five CAD genes in developing
Brachypodium plants, the functions of BdCAD3 and BdCAD5
as well as BdCAD4, which has a chloroplast-targeting signal
peptide, are discussed. It is anticipated that detailed knowledge
of these genes will support a strategy to reduce recalcitrance by
modifying lignin composition with a minor biomass yield penalty,
which subsequently can be handled by plant breeding.

Materials and methods 

Plant material 

The B. distachyon genotype Bd21-3 was used for all experiments. Plants
were grown in a naturally lit greenhouse with standard irrigation and
fertilization. Plants were harvested at the seed-filling stage. Harvested
plant material was immediately frozen in liquid nitrogen and stored at
-80 °C until use.

Isolation of gDNA and RNA and synthesis of cDNA

RNA was extracted from different Brachypodium tissues using an
RNeasy Kit (Qiagen, UK), according to the manufacturer's protocol.
RNA was treated with RQ1 RNase-Free DNase (Promega, USA) before
reverse transcription into cDNA using iScript cDNA Synthesis Kit
(Bio-Rad, USA) or M-MuLV Reverse Transcriptase RNase H (Finnzymes,
FI) according to the manufacturer's protocol using a p(dT)18 primer.
DNA was extracted at the immature seed stage from Brachypodium leaf
tissue using a DNeasy Plant Mini Kit, according to the manufacturer's
protocol.

Cloning of Brachypodium CAD genes 

The Brachypodium genome (www.brachypodium.org) 8x release (version
1.2) was screened in silico for putative CAD sequences using
known CAD sequences from Arabidopsis, rice, sorghum, maize, wheat,
and perennial ryegrass as queries. Gateway-specific primers with attB1
and attB2 sites were designed (Supplementary Table S1, available at
JXB online) for amplification of the open reading frames of putative
Brachypodium CAD, including flanking nucleotides from the 5'- and
3'-UTR respectively to maintain the correct frame. PCR was conducted
on both gDNA and cDNA for amplification of the different
CAD sequences using LaTaq (Takara, Japan) and buffer [GCI/II buffer
(Takara) was used for CAD2 from gDNA and CAD7 from cDNA],
according to the manufacturer's protocol, and a three-step amplification
program (Supplementary Table S1). All products were cloned into the
pDONR201 vector, propagated in Escherichia coli Top10 (Invitrogen,
USA) and inserts were confirmed by sequencing (MWG, Germany).
Sequence analyses were performed using CLC Main Workbench version
6.6 (CLC bio, Aarhus, Denmark). Sequences were deposited at
GenBank [accession numbers: JQ768796 (BdCAD3) and JQ768797
(BdCAD5)].

Expression of CAD genes in different Brachypodium tissues 

Semi-quantitative reverse-transcription PCR was used to determine
expression levels in root, stem, leaf, and spike tissues, respectively.
Gene-specific primer pairs (Supplementary Table S1) flanking introns
were designed to discern between amplification resulting from cDNA
and residual gDNA. cDNA concentrations were adjusted to 100 ng µl⁻¹.
Parallel samples were run using the primer pair RT-BdUBI4 as a control.
PCR products were sequenced to confirm primer specificity.

Alignment 

Protein sequences for the seven putative CAD enzymes in the
Brachypodium genome 8x release (version 1.2) and sequences of 79
experimentally proven or putative CADs from other species were
aligned with Muscle version 3.7 (Edgar, 2004) configured for highest
accuracy. Alignment of protein sequences with Muscle and calculation
of sequence identity were done in CLC Main Workbench version
6.6. Secondary structure elements and amino-acid residues related to
the substrate-binding site or zinc ion-binding site (Youn et al., 2006)
were annotated using ESPript (Gouet et al., 1999). The plant protein
sequences used are listed in Supplementary Table S2.

Phylogenetic tree 

The protein sequences for 86 putative CAD enzymes were obtained 
from GenBank, Ensembl Genomes project, UniProt, Phytozome, 
and WheatEST and they are described in Supplementary Table S2. 
Sequences were aligned with MUSCLE version 3.7 (Edgar, 2004) and 
ambiguous regions (i.e. containing gaps and/or poorly aligned) were 
removed with Gblocks version 0.91b (Talavera and Castresana, 2007) 
using the default settings and a minimum length of a block after gap 
cleaning of 10. A phylogenetic tree was reconstructed using the max- 
imum-likelihood method implemented in the PhyML program version 
3.0 aLRT (Anisimova and Gascuel, 2006) on the Phylogeny.fr platform
(Dereeper et al., 2008) with the LG substitution model selected and
using the default settings. Reliability of internal branches was assessed
using the bootstrapping method with 100 bootstrap replicates. Graphical
representation and editing of the phylogenetic tree were performed with
TreeDyn version 198.3 (Chevenet et al., 2006).



Structural modelling of BdCAD3 and BdCAD5

The sequences were analysed for signal peptides using the SignalP
server (Bendtsen et al., 2004) and the protein sequences were aligned
using ClustalW version 2 (Thompson et al., 1994; Larkin et al., 2007).
In each modelling process, 50 models were generated based on the
reported crystal structures of AtCAD5 (accession 2CF5; Youn et al.,
2006) and PotSAD (accession 1YQD; Bomati and Noel, 2005) using
MODELLER version 9.9 (Sali and Blundell, 1993). Heteroatoms from
the database file (Zn and NAP) were transferred to the model from the
respective templates by MODELLER and were treated as bulk residues
in the optimization process. The obtained models were evaluated
by their DOPE score and the models with best DOPE scores were
finally analysed and verified using Verify-3D (Bowie et al., 1991;
Lüthy et al., 1992).



Expression and purification of recombinant enzymes 

The coding regions of BdCAD5 and BdCAD3 were transferred from the
donor vector pDONR201 into the expression vector pDEST17 by the
LR recombination reaction (Gateway; Invitrogen), according to the
manufacturer's protocol. Cultures (400 ml) were grown at 37 °C until
an A600 of approximately 1 was obtained. The cells were cooled to
16 °C and IPTG was added to a final concentration of 0.5 mM to induce
expression of the His-tagged CAD protein, and the cultures were grown
for another 26 h at 16 °C. Cells were harvested at 4000 g and 4 °C for
15 min and then stored at -20 °C until use.

The frozen pellets were thawed on ice and resuspended in lysis buffer
(50 mM NaH2PO4, 300 mM NaCl with 10 mM imidazole, pH 8.0). The
cells were disrupted by sonication on ice and the extracts were cleared
by centrifugation at 12,000 g and 4 °C for 30 min. To the cleared lysates,
1 ml Ni-NTA agarose (Qiagen) was added, according to the manufacturer's
protocol, and the mixture was poured into a Ni-NTA Superflow column
(Qiagen). After washing the column using lysis buffer (starting with
10 mM then followed by 20 mM imidazole), the His-tagged BdCAD5 was
eluted with 250 mM imidazole in lysis buffer.

After desalting, the BdCAD5 fraction was applied to a MonoQ HR
5/5 anion exchange column (GE Healthcare, UK) equilibrated in buffer
A at a flow rate of 1 ml min⁻¹. Proteins were eluted by a linear NaCl
gradient from 0 to 1.0 M and collected in 1 ml fractions. Fractions showing
CAD activity were pooled and concentrated.

The final purification step of BdCAD5 and BdCAD3 was size
exclusion chromatography using a Sephacryl 200 HR 20/60 column
(GE Healthcare) at 0.5 ml min⁻¹, equilibrated with 20 mM TRIS-HCl,
150 mM NaCl, pH 7.5. Collected fractions showing CAD activity were
concentrated and stored in 20 mM TRIS-HCl, 150 mM NaCl, pH 7.5,
5% ethylene glycol, 5 mM DTT at -20 °C until analysis. Presence and
purity of BdCAD5 and BdCAD3 were analysed by SDS-PAGE using
PageRuler Prestained Protein ladder (Fermentas, CA, USA) and bands
were visualized by Coomassie Brilliant Blue G-250 (Candiano et al.,
2004). Protein was quantified at 280 nm using a NanoDrop 2000
(Thermo) and bovine serum albumin as standard.



Determination of the molecular weight by size exclusion 
chromatography 

BdCAD5 was analysed for dimer conformations by size exclusion
chromatography at 0.5 ml min⁻¹, equilibrated with 20 mM TRIS-HCl, 150 mM
NaCl, pH 7.5 and detected at 220 and 280 nm. A combination of the Low
and the High Molar Weight calibration kits (GE Healthcare), containing
ferritin (440,000 Da), aldolase (158,000 Da), albumin (67,000 Da),
ovalbumin (43,000 Da), and ribonuclease A (13,700 Da), was used as
standard and Blue dextran (2000 kDa) for determining void volume.
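
As an illustration of how such a calibration is typically evaluated, the following minimal Python sketch fits the partition coefficient Kav against log10 of the molar mass of the standards and inverts the line for an unknown. The standard masses are those listed above; the elution volumes, void volume, and total column volume are hypothetical placeholders and do not come from this study.

```python
import math

# Masses (Da) are the standards named in the text. The elution volumes (ml),
# VOID_VOLUME and TOTAL_VOLUME are HYPOTHETICAL values for illustration only.
STANDARDS = {
    "ferritin": (440_000, 52.0),
    "aldolase": (158_000, 61.0),
    "albumin": (67_000, 68.5),
    "ovalbumin": (43_000, 72.5),
    "ribonuclease A": (13_700, 82.5),
}
VOID_VOLUME = 40.0    # would be read from the Blue dextran peak
TOTAL_VOLUME = 120.0  # geometric bed volume of the column


def k_av(elution_volume):
    """Partition coefficient: fraction of the gel volume accessible to the protein."""
    return (elution_volume - VOID_VOLUME) / (TOTAL_VOLUME - VOID_VOLUME)


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards=STANDARDS):
    """Fit Kav as a linear function of log10(molar mass)."""
    xs = [math.log10(mass) for mass, _ in standards.values()]
    ys = [k_av(volume) for _, volume in standards.values()]
    return fit_line(xs, ys)


def estimate_mass(elution_volume, slope, intercept):
    """Invert the calibration line to estimate a molar mass in Da."""
    return 10 ** ((k_av(elution_volume) - intercept) / slope)


if __name__ == "__main__":
    slope, intercept = calibrate()
    print(f"estimated mass: {estimate_mass(73.0, slope, intercept):.0f} Da")
```

With real elution volumes substituted for the placeholders, the same inversion would be applied to each resolved peak of the sample.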



pH dependence for coniferyl aldehyde 

The pH-dependent change in molar absorption of coniferyl aldehyde
was obtained by measuring the UV-Vis spectra at different pH on a
Shimadzu UV-3600 spectrophotometer. Coniferyl aldehyde (~6 mg) was
solubilized in 1 ml 2-methoxyethanol (Sigma-Aldrich, USA) and further
diluted in a multi-pH buffer (50 mM sodium citrate, 50 mM Na2HPO4,
and 50 mM sodium borate titrated with 4 M NaOH) to 0.03479 mM. The
buffer was used in the range pH 3.5-11.5.



Enzyme assays 

The enzymatic reduction of coniferyl aldehyde was determined by the
decrease in absorbance at 340 nm due to the consumption of coniferyl or
sinapyl aldehyde and NADPH (Wyrambik and Grisebach, 1975). The
incubation mixture (final volume 1.6 ml) contained 0-110 µM coniferyl
or sinapyl aldehyde and 0-30 µM NADPH in 100 mM sodium acetate,
pH 5.35 (BdCAD3) or 0-110 µM coniferyl or sinapyl aldehyde and
0-30 µM NADPH with 10 µl diluted protein in 200 mM potassium
phosphate buffer, pH 6.25 (BdCAD5). The protein was diluted in dilution
buffer (20 mM TRIS-HCl, 5% ethylene glycol, and 5 mM DTT, pH
7.5) prior to measurements, to have a satisfactory activity with linearity
in time. The molar absorptions of coniferyl and sinapyl aldehydes were
similar in value.

For the oxidation of coniferyl alcohol, the enzymatic reaction was
measured by the increase in absorbance at 400 nm due to the formation
of coniferyl aldehyde (Wyrambik and Grisebach, 1975). The incubation
mixture (1.6 ml) contained 0-1200 µM coniferyl alcohol, 0-1400 µM
NADP+, 10 µl diluted protein in 100 mM TRIS-HCl, pH 8.8 (BdCAD3)
or 0-100 µM coniferyl alcohol, 0-70 µM NADP+, 10 µl diluted protein
in 100 mM TRIS-HCl, pH 8.8 (BdCAD5). Protein was diluted in
dilution buffer prior to measurements. The steady-state parameters were
determined by fitting the rates to the Michaelis-Menten equation using
the nlrwr package (Ritz and Streibig, 2008) in R (R Development Core
Team, 2011).
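
The kind of least-squares fit described above can be sketched as follows. This is a plain-Python stand-in, not the R script used in the study: for each trial Km the best Vmax has a closed form, so the fit reduces to a one-dimensional golden-section search over Km. The function names and the synthetic data in the usage block are hypothetical.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def best_vmax(km, substrate, rates):
    """For a fixed Km the model is linear in Vmax, so least squares is closed-form."""
    f = [s / (km + s) for s in substrate]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)


def sse(km, substrate, rates):
    """Sum of squared residuals at a given Km, using the optimal Vmax for it."""
    vmax = best_vmax(km, substrate, rates)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def golden_min(func, lo, hi, iterations=100):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = func(c), func(d)
    for _ in range(iterations):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = func(c)
        else:
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = func(d)
    return (a + b) / 2


def fit_michaelis_menten(substrate, rates):
    """Return (Vmax, Km); assumes the residual profile over Km is unimodal."""
    km = golden_min(lambda k: sse(k, substrate, rates), 1e-6, 10 * max(substrate))
    return best_vmax(km, substrate, rates), km


if __name__ == "__main__":
    # Synthetic, noise-free rates over a 0-110 µM range (hypothetical values).
    conc = [5, 10, 20, 40, 80, 110]
    obs = [mm_rate(s, 2.0, 15.0) for s in conc]
    print(fit_michaelis_menten(conc, obs))
```

A general nonlinear regression routine would additionally report standard errors for Km and Vmax, which this sketch does not attempt.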



The pH optimum of catalytic activity 

The pH profile of the catalytic activity of BdCAD5 and BdCAD3 was
determined at pH 4.5-10 using coniferyl aldehyde and NADPH or
coniferyl alcohol and NADP+ as substrates and cofactors, respectively,
in multi-pH buffer in the range pH 3.5-11.5.



Temperature optimum of catalytic activity 

The temperature profile of the catalytic activity of BdCAD3 and
BdCAD5 was determined using coniferyl aldehyde and NADPH as substrates
in a multi-pH buffer pH 5.2 (BdCAD3) or 100 mM sodium phosphate
buffer pH 6.40 (BdCAD5), according to the optimum pH. Buffer
was heated to the desired temperature before coniferyl aldehyde, NADPH,
and finally enzyme were added and the activity was measured instantly.



Tryptic peptide mapping 

The purified BdCAD3 and BdCAD5 proteins were analysed by 
Alphalyse A/S (Odense, Denmark) by MALDI-TOF tryptic peptide 
mass fingerprinting and MALDI-TOF/TOF peptide sequencing after 
treatment of the protein with trypsin. 



Results 

In silico characterization of the B. distachyon 
CAD genes 

In silico screening of the Brachypodium genome
(www.brachypodium.org) identified seven putative CAD genes numbered
according to Guo et al. (2010): BdCAD1: Bradi5g04130.1;
BdCAD2: Bradi5g21550.1; BdCAD3: Bradi4g29770.1;
BdCAD4: Bradi4g29780.1; BdCAD5: Bradi3g06480.1; BdCAD7:
Bradi3g17920.1; and BdCAD8: Bradi3g22980.1. However, in a
previous study on the evolution of the CAD/SAD gene family,
Guo et al. (2010) annotated Bradi3g10580.1 as BdCAD6.
Bradi3g10580.1 was not considered further in this study as
the deduced protein sequence lacks characteristics important
for CAD activity. For further details, see Supplementary
Table S3. The CAD genes were distributed on three of the
five chromosomes of Brachypodium: BdCAD1 and BdCAD2
on chromosome 5, BdCAD3 and BdCAD4 next to each other
on chromosome 4, and BdCAD5, BdCAD7 and BdCAD8 on
chromosome 3.
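
The chromosome assignment can be read directly from the locus identifiers, as the short Python sketch below illustrates. It assumes that the digit following "Bradi" encodes the chromosome, which is consistent with the distribution stated above; the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Gene models as listed in the text.
LOCI = {
    "BdCAD1": "Bradi5g04130.1",
    "BdCAD2": "Bradi5g21550.1",
    "BdCAD3": "Bradi4g29770.1",
    "BdCAD4": "Bradi4g29780.1",
    "BdCAD5": "Bradi3g06480.1",
    "BdCAD7": "Bradi3g17920.1",
    "BdCAD8": "Bradi3g22980.1",
}

# Assumed pattern: Bradi<chromosome>g<gene number>.<transcript number>
LOCUS_PATTERN = re.compile(r"Bradi(\d+)g(\d+)\.(\d+)")


def chromosome(locus):
    """Return the chromosome number encoded in a locus identifier."""
    match = LOCUS_PATTERN.fullmatch(locus)
    if match is None:
        raise ValueError(f"unrecognised locus identifier: {locus}")
    return int(match.group(1))


def group_by_chromosome(loci):
    """Map chromosome number to the sorted list of gene names located on it."""
    groups = defaultdict(list)
    for gene, locus in loci.items():
        groups[chromosome(locus)].append(gene)
    return {chrom: sorted(genes) for chrom, genes in sorted(groups.items())}


if __name__ == "__main__":
    for chrom, genes in group_by_chromosome(LOCI).items():
        print(f"chromosome {chrom}: {', '.join(genes)}")
```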

Structural analysis of the isolated Brachypodium CAD 
genes revealed different intron-exon patterns both in relation 
to position and number of introns, which ranged from one to 
four per gene (Fig. 1). Furthermore, substantial differences 
in the size between the exons were observed. Comparison of 
the positions of the splicing sites (intron locations) between 
the seven CAD sequences showed conserved domains without 
splicing sites. A 114-bp sequence (black boxes in Fig. 1) was 
found in all seven sequences and encoded the putative bind- 
ing site I for the monolignol substrate according to the model 
proposed by Youn et al. (2006) (see also annotations on the 
alignment in Fig. 4). The 114-bp sequence was identical to 
the second exon flanked by introns in BdCAD2, -3, -4, -5, 
and -8. Relative to these, the 114-bp sequence was merged
with exon 1 and exon 3 in BdCAD1 and merged with exon 2 in
BdCAD7.

The last intron in BdCAD2, -7, and -8 was located in the middle
of the helix-structured monolignol-binding site II. This splicing
site was conserved relative to the amino-acid sequences of
binding site II, AFAL↓VGG, AFAL↓VAK, and ATLN↓LGA in
BdCAD2, -7, and -8, respectively, approximately 200 nucleotides
upstream from the stop codon. The first exon of BdCAD4 contained
an additional 177 nucleotides encoding a putative 59 amino-acid
signal peptide. No signal peptides could be predicted in any
of the other six CAD genes.



[Fig. 1 graphic: exon maps for BdCAD1 (Bradi5g04130.1), BdCAD2 (Bradi5g21550.1), BdCAD3 (Bradi4g29770.1), BdCAD4 (Bradi4g29780.1), BdCAD5 (Bradi3g06480.1), BdCAD7 (Bradi3g17920.1), and BdCAD8 (Bradi3g22980.1)]




Fig. 1. Intron-exon structure of Brachypodium distachyon CAD genes. Boxes represent exons, black indicating the splicing-site-free 
box coding for binding site I; numbers are the size of the exons measured in nucleotides. 
In vitro characterization of the BdCAD genes

Five genes, BdCAD2, -3, -4, -5, and -7, were isolated from
genomic DNA from the Brachypodium genotype Bd21-3.
BdCAD3 and BdCAD5 were also isolated from cDNA.
Sequencing of the isolated genes revealed minor sequence dif- 
ferences between the genotypes Bd21 (database) and Bd21-3 
(this study). For BdCAD3 the sequence of the isolated cDNA 
aligned perfectly with the cDNA predicted from the database. 
For BdCAD5 three nucleotide differences between the database- 
predicted cDNA and the isolated cDNA were identified. None of 
these resulted in changes in the deduced amino-acid sequence. 

BdCAD gene expression in tissues 

The expression patterns of five BdCAD genes were analysed
by semi-quantitative reverse-transcription PCR in four
Brachypodium tissues at the seed-filling stage: root, stem, leaf,
and spike (Fig. 2). BdCAD5 was the only gene to be expressed in
all tissues and the relative expression varied across tissues, with
the highest expression in root and stem. BdCAD2 was expressed
in the root, stem, and spike, with the highest expression in the
stem. BdCAD3 was expressed in stem and spike and BdCAD4
was solely expressed in the stem. BdCAD7 was expressed in all
tissues except the root, showing highest expression in the leaf.

Characterization of the B. distachyon CAD proteins 

The BdCAD5 protein sequence was found most similar to CAD
proteins from Festuca arundinacea, wheat, and Lolium perenne,
with identities close to 90%. Further phylogenetic analysis
revealed that BdCAD5 clustered with AtCAD5, SbCAD1, and
ZmCAD2 (group I, Fig. 3), all of which have been characterized
as bona fide CAD genes involved in lignin biosynthesis. The
identity between BdCAD5 and AtCAD5 was 73% whereas the
Fig. 2. Semi-quantitative reverse-transcription PCR of CAD gene
expression in different tissues of Brachypodium distachyon Bd21-3.
RNA was extracted from root, stem, leaf, and spike at the
seed-filling stage. The ubiquitin gene BdUBI4 served as reference of
expression. Expression was detected for five of the seven BdCAD
genes, as indicated.



identity to SbCAD2 and ZmCAD2 was 86%. BdCAD3 in group
IIIb had the highest sequence identities to CAD proteins from
maize, sorghum, rice, and L. perenne, in the range of 78-83%.
BdCAD2, BdCAD3, BdCAD4, and BdCAD7 did not group
with the bona fide CADs, as indicated by identities to ZmCAD2
in the range of 46-51%. BdCAD2 in group IIIa had the highest
sequence identities with sorghum, maize, rice, and L. perenne,
in the range of 68-74%. BdCAD4 in group IIIa had the highest
sequence identity to CAD proteins from rice and sorghum,
in the range of 80-84%. For BdCAD7 in group IIIa, the highest
identities were observed with L. perenne, maize, rice, and
sorghum, in the range of 82-85%. When comparing 86 plant
CAD sequences from database searches, five groups were found:
groups I-V. BdCAD5 was positioned in the bona fide group I,
whereas BdCAD2, -3, -4, and -7 were located in group III,
BdCAD1 in group IV and BdCAD8 in group V. Sequences from
C3 plants (maize, rice, Festuca, Brachypodium, and barley) fell
in the same subgroups splitting from sequences from C4 plants
and sequences from monocot and dicot species were present in
all five subgroups.

The protein sequences of selected bona fide CAD sequences
(from group I) were aligned together with BdCAD5, BdCAD3,
and SbCAD8-4 (Fig. 4) and the overall identity for the selected
sequences was 37% (135 invariant residues out of a total of 368).
The cofactor- and zinc-binding sequences conserved in alcohol
dehydrogenases (ADHs) were present in all BdCADs. The
Zn1-binding motif and structural Zn2 consensus regions were
located at amino acids 48, 70-71, 164, and 101, 104, 107, and
115, respectively. The NADP-binding domain was identified as
residues 193, 212-217, and 276 and the sequences also contained
a GLGGV(L)G sequence in the loop between β-sheet 10
and α-helix 5 (Fig. 4), known as a Rossmann fold (Rossmann
et al., 1974). Of the seven invariant Cys, six were related to
the Zn-binding sites. The residues related to binding of cofactor
NADP(H) and aldehyde substrate showed a high degree of similarity
throughout the alignment.
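
The overall identity quoted above is the share of alignment columns in which every sequence carries the same residue. A minimal Python sketch of that count is given below; the function names and the toy alignment are hypothetical, and the arithmetic check uses only the figures stated in the text.

```python
def invariant_columns(aligned):
    """Count alignment columns where all sequences share one residue (gaps excluded)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return sum(
        1 for column in zip(*aligned)
        if column[0] != "-" and len(set(column)) == 1
    )


def percent_identity(n_invariant, n_columns):
    """Overall identity as a percentage of the alignment length."""
    return 100.0 * n_invariant / n_columns


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, not taken from Fig. 4).
    toy = ["MKV-A", "MRVGA", "MKVGA"]
    print(invariant_columns(toy), "invariant columns out of", len(toy[0]))
    # Figures quoted in the text: 135 invariant residues out of 368.
    print(f"{percent_identity(135, 368):.1f}%")
```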

The protein sequences of the seven putative BdCADs found 
by in silico screening of the Brachypodium genome were aligned 
(Supplementary Fig. SI). Out of 379 amino-acid residues, 89 
were invariant (23% identity). The amino-acid residues 58HL59 
in the BdCAD5 sequence, believed to be part of the catalytic 
mechanism as a proton donor, were not found in the other 
BdCADs that contain either EW or DW dipeptide, except for 
BdCADl which contains QH indicating a change from E to Q as 
a result of a G→C base transversion. This change in one of the
proton-donating residues according to the model of Youn et al. 
(2006) may lead to an inactive BdCAD%. 

The protein sequence of BdCAD4 was predicted to have a
chloroplast or mitochondrial signal peptide by the TargetP, ChloroP,
and SignalP algorithms (Emanuelsson et al., 2007), and
OsCAD8B, OsCAD8C, SbCAD4-4, SbCAD8-1, and HvCAD8B also
contain these putative signal peptides. Mannitol dehydrogenase
from Zea mays (ZmMTD, accession ACG28500) showed 50%
identity to BdCAD4 and it contains a similar chloroplast signal
peptide and all residues known to be critical for CAD activity.
The reliability class for the prediction in TargetP was the lowest
of all tested sequences with a value of 2. Besides an obvious
need for ADH activity in these organelles, there is no hint in
Fig. 3. Phylogenetic analysis of 86 selected putative and experimentally demonstrated CAD protein sequences from monocots and
dicots. Brachypodium protein sequences are in blue, sequences of monocots are in bold, and dicots are in normal type. The phylogenetic
tree was estimated using the maximum-likelihood method implemented in the PhyML program aLRT version 3.0 and the bootstrap
method with 100 repetitions was used to determine the certainty of the branch topology. The phylogenetic tree was drawn using
TreeDyn (version 198.3). Numbers at nodes are bootstrap values (>0.75, or >75%). The sequences of Brachypodium distachyon (Bd)
were retrieved from the Ensembl Genomes project; Triticum aestivum (Ta) from WheatEST; Arabidopsis thaliana (At), Aralia
cordata (Ac), Eucalyptus gunnii (Eg), Eucalyptus globulus (Egl), Festuca arundinacea (Fa), Hordeum vulgare (Hv), Lolium perenne (Lp),
Medicago sativa (Ms), Oryza sativa (Os), Panicum virgatum (Pv), Picea abies (Pa), Pinus taeda (Pta), Populus tremuloides (Pot), Sorghum
bicolor (Sb), and Zea mays (Zm) from GenBank; Nicotiana tabacum (Nt) from UniProt; and Populus trichocarpa (Poptr) sequences from
Phytozome. Protein names and identifiers are shown in Supplementary Table S2.



the literature to any detectable CAD activity in chloroplast or 
mitochondria. 

Optimization of the enzyme assay 

Since the absorbance of coniferyl aldehyde was found to be 
strongly pH dependent, due to the phenol-phenolate equilibrium, 



the UV-Vis spectra in the region 200-550 nm of the diluted solutions
of coniferyl aldehyde were recorded (Supplementary Fig.
S2A). An increase in pH resulted in a decrease in absorbance at
221, 238, and 338 nm and an increase in absorbance at 268 and
401 nm. A total of four isosbestic points
at 213, 251, 278, and 360 nm were observed in the spectra. The
molar absorbance coefficients at 340 nm and 400 nm were plotted
Fig. 4. Alignment of the protein sequences of BdCAD3 and BdCAD5 and other putative lignin CADs. The alignment also includes
sequences from rice (OsCAD2), switchgrass (PvCAD1), maize (ZmCAD), sorghum (SbCAD2 and SbCAD8-4), eucalyptus (EgCAD2),
Arabidopsis (AtCAD5), Populus tremuloides (PotSAD) and tobacco (NtCAD19). Secondary structure elements extracted from the protein
structure of AtCAD5 (accession 2CF5) are indicated at the top and those from PotSAD (accession 1YQD) at the bottom. Arrows indicate
β-sheet structure, helices indicate helical structure motifs, and TT indicates turns. The amino acids believed to be part of the active sites
as discussed in Youn et al. (2006) are shown below the alignment with the following codes: * = NADP+ binding; • = monolignol-binding site
I; •monolignol-binding site II; = Zn-binding catalytic; ▲ = Zn-binding structural. 



as a function of the pH value, and the pKa of coniferyl aldehyde 
was determined to be 8.09 ± 0.04 (Table 1) by fitting the function 
$\varepsilon_{\mathrm{total}} = \left(\varepsilon_{\mathrm{acid}} + \varepsilon_{\mathrm{base}} \times 10^{-(\mathrm{p}K_a - \mathrm{pH})}\right) / \left(1 + 10^{-(\mathrm{p}K_a - \mathrm{pH})}\right)$ 
to the spectra of coniferyl aldehyde at different pH values (Supplementary 
Fig. S2B). The function used was derived by relating the 
Lambert-Beer law to the mass balance and the acid constant 
relations. 
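
The derivation itself is not written out in the text. A possible sketch of how the three relations combine is given below. The symbols HA and A^- (phenol and phenolate forms), c_total (total concentration), A (absorbance), and l (path length) are introduced here for illustration only and are not taken from the paper.

```latex
\begin{align}
K_a &= \frac{[\mathrm{H^+}][\mathrm{A^-}]}{[\mathrm{HA}]}
  \;\Rightarrow\;
  \frac{[\mathrm{A^-}]}{[\mathrm{HA}]} = 10^{-(\mathrm{p}K_a - \mathrm{pH})}
  && \text{(acid constant)} \\
c_{\mathrm{total}} &= [\mathrm{HA}] + [\mathrm{A^-}]
  && \text{(mass balance)} \\
A &= \left(\varepsilon_{\mathrm{acid}}[\mathrm{HA}] + \varepsilon_{\mathrm{base}}[\mathrm{A^-}]\right) l
  && \text{(Lambert-Beer law)} \\
\varepsilon_{\mathrm{total}} &= \frac{A}{c_{\mathrm{total}}\, l}
  = \frac{\varepsilon_{\mathrm{acid}} + \varepsilon_{\mathrm{base}} \times 10^{-(\mathrm{p}K_a - \mathrm{pH})}}
         {1 + 10^{-(\mathrm{p}K_a - \mathrm{pH})}}
\end{align}
```

Dividing the numerator and denominator of the last line by [HA] gives the fitted form, so that the function approaches the acid coefficient at low pH and the base coefficient at high pH.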

Characterization of the recombinant BdCAD3 and 
BdCAD5 enzymes 

Recombinant BdCAD3 and BdCAD5 were expressed in E. coli as 6×His-tagged proteins and purified to homogeneity in three steps: Ni-affinity capture, followed by anion exchange chromatography, and finally size exclusion chromatography. The relative molar masses were determined by SDS-PAGE to be 48.7 and 49.0 kDa for BdCAD3 and BdCAD5, respectively (Supplementary Fig. S3). The theoretical molar masses of BdCAD3 and BdCAD5 were calculated to be 37.7 and 38.5 kDa, respectively (Supplementary Table S3). 

Using size exclusion chromatography, the molar mass of BdCAD5 was determined to be 40.3 kDa. The elution profile (data not shown) was almost symmetric, with a shoulder at 97.4 kDa. The total area of the peak could be deconvoluted into two peaks, covering 76% of the area at 40.3 kDa and 24% at 97.4 kDa. This observation indicated that active soluble BdCAD5 existed in equilibrium between a monomeric and a homo-dimeric form. 
Table 1. The parameters of the function $\varepsilon_{\mathrm{total}} = \left[\varepsilon_{\mathrm{acid}} + \varepsilon_{\mathrm{base}} \times 10^{-(\mathrm{p}K_a - \mathrm{pH})}\right] / \left[1 + 10^{-(\mathrm{p}K_a - \mathrm{pH})}\right]$ fitted to molar absorption at 340 and 400 nm as a function of pH 

Wavelength (nm)   ε_acid (M⁻¹ cm⁻¹)    ε_base (M⁻¹ cm⁻¹)    pKa
340               23155.7 ± 207.8      5626.7 ± 267.1       8.09 ± 0.039
400               495.8 ± 108.4        35634.7 ± 138.2      8.07 ± 0.010

Data are presented in Supplementary Fig. S2. The function was 
derived from the expression of the acid constant, the mass balance, 
and the Lambert-Beer law and describes the pH dependence of the 
absorption of coniferyl aldehyde with respect to the pKa of the phenol group. 
ε_acid and ε_base are the molar absorption coefficients of the protonated 
(phenol) form and the unprotonated (phenolate) form of coniferyl 
aldehyde, respectively. The molar absorption coefficient at any 
required pH can be calculated using the formula and the constants. 
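
As an illustration of how the formula and the fitted constants give the molar absorption coefficient at a chosen pH, the following minimal Python sketch evaluates the function with the Table 1 values. The function and variable names are hypothetical and chosen only for this example; the standard errors of the fit are ignored.

```python
# Constants fitted in Table 1 (molar absorption coefficients in M^-1 cm^-1).
TABLE_1 = {
    340: {"eps_acid": 23155.7, "eps_base": 5626.7, "pKa": 8.09},
    400: {"eps_acid": 495.8, "eps_base": 35634.7, "pKa": 8.07},
}


def molar_absorption(wavelength_nm, pH):
    """Molar absorption coefficient of coniferyl aldehyde at the given pH."""
    p = TABLE_1[wavelength_nm]
    # Phenolate-to-phenol ratio from the acid constant: 10^-(pKa - pH).
    ratio = 10.0 ** (-(p["pKa"] - pH))
    return (p["eps_acid"] + p["eps_base"] * ratio) / (1.0 + ratio)


if __name__ == "__main__":
    # Example conditions: the two wavelength/pH pairs used in assays.
    for wavelength, pH in [(340, 6.5), (400, 8.8)]:
        eps = molar_absorption(wavelength, pH)
        print(f"{wavelength} nm, pH {pH}: {eps:.0f} M^-1 cm^-1")
```

At pH equal to the pKa the function returns the mean of the two coefficients, which is a quick check that the constants have been entered correctly.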
The purified BdCAD3 and BdCAD5 proteins were analysed by MALDI-TOF peptide mass fingerprinting and MALDI-TOF/TOF peptide sequencing after treatment with trypsin. The peptides were searched against both the NCBI database of general non-redundant sequences and a database covering 31,029 protein sequences at www.brachypodium.org. Based on a probability-scoring algorithm, the significant best-matching protein for BdCAD3 was Bradi4g29770.1 (sequence coverage 37%) and for BdCAD5 was Bradi3g06480.1 (sequence coverage 58%), both found in the Brachypodium.org database. 

The pH optima of BdCAD3 and BdCAD5 were investigated for both reducing and oxidizing catalytic activity (Fig. 5A). The optimal specific activity for the reduction process was found at pH 5.2 for BdCAD3 and pH 6.1 for BdCAD5. The oxidation of coniferyl alcohol showed maximal activity at pH 8.0 and 9.7, respectively. For both the reduction and the oxidation process, the specific activity of BdCAD5 was significantly higher than that of BdCAD3. 

The reduction activity was maximal at 50-55 °C for both BdCAD3 and BdCAD5 (Fig. 5B) and, as for the pH dependency, BdCAD5 showed significantly higher specific activity compared with BdCAD3. 

The stability of purified BdCAD3 and BdCAD5 was moni- 
tored by the specific activity using coniferyl aldehyde as sub- 
strate. There was no drop in activity over a period of 12 days 
after dilution at 4 °C (data not shown). 

The steady-state parameters for the purified BdCAD3 and 
BdCAD5 were determined using coniferyl aldehyde and 
coniferyl alcohol as substrates (Table 2). BdCAD3 and BdCAD5 
differed substantially in their overall catalytic properties. The 
efficiency of the reduction of coniferyl aldehyde by BdCAD5, 
expressed as kcat/Km, was 50-fold higher than that of 
BdCAD3. For coniferyl aldehyde, the Km value for 
BdCAD5 was 10-times lower than for BdCAD3, which could 
indicate a difference in choice of primary substrate. The same 
was observed when comparing the ability to oxidize coniferyl 
alcohol: BdCAD5 showed an efficiency 100-times higher than 
that determined for BdCAD3. The maximal velocity of 
the conversion of coniferyl alcohol by BdCAD3 was found to 
be 4.6-times higher than for BdCAD5, but the overall efficiency 
Fig. 5. The purified BdCAD3 and BdCAD5 were both investigated 
regarding the pH optima (A) for the conversion of coniferyl 
aldehyde and coniferyl alcohol, and for the optimal assay 
temperature (B). The assay temperature in the pH experiment was 
23 °C and the NADP+ concentration in the assay for conversion 
of coniferyl alcohol was 137 µM. The enzymatic conversion 
was normalized to the protein concentration. The vertical error 
bars indicate the root mean square values of three independent 
measurements. 



was 100-times lower than for BdCAD5 because of a very high 
Km value. For comparison, the conversion of sinapyl aldehyde 
was tested as well. The enzyme batch had been stored in the freezer for 
6 months, resulting in a slight reduction in activity. For a reliable 
comparison, all measurements were performed on enzyme batches stored 
under the same conditions and for the same time. For BdCAD3, the 
values were Km 17.1 vs. 6.1 µM, Vmax 92.6 vs. 25.4 nkat (mg protein)⁻¹, 
kcat 5.2 vs. 1.4 s⁻¹, and kcat/Km 0.30 vs. 0.23 s⁻¹ µM⁻¹. For 
BdCAD5, the values were Km 4.9 vs. 6.5 µM, Vmax 513 vs. 589 
Table 2. The kinetic parameters of the purified BdCAD3 and BdCAD5 with coniferyl aldehyde and coniferyl alcohol as substrates 

Enzyme   Substrate            Km (µM)          Vmax (nkat mg⁻¹)   kcat (s⁻¹)   kcat/Km (s⁻¹ µM⁻¹)
BdCAD3   Coniferyl aldehyde   33.3 ± 3.80      154.3 ± 6.86       8.66         0.26
         NADPH                1.30 ± 0.12      114.9 ± 2.04       6.45         4.96
         Coniferyl alcohol*   509.1 ± 134.0    611.5 ± 72.1       34.33        0.067
         NADP+*               66.0 ± 11.79     93.3 ± 3.31        5.24         0.079
BdCAD5   Coniferyl aldehyde   3.1 ± 0.44       492.3 ± 19.16      38.60        12.87
         NADPH                7.2 ± 1.7        487.1 ± 32.1       38.19        5.30
         Coniferyl alcohol*   1.44 ± 0.24      131.8 ± 4.55       10.35        7.19
         NADP+                0.72 ± 0.10      125.9 ± 3.55       9.87         13.71

The parameters were calculated by fitting the Michaelis-Menten equation to initial rate experimental data using non-linear fitting (n = 3; except 
*, n = 1) in OriginPro (Originlab). Reactions were carried out at 23 °C and the protein concentrations were 0.42 (BdCAD5) and 0.58 µg/ml 
(BdCAD3). 



nkat (mg protein)⁻¹, kcat 40.2 vs. 45.7 s⁻¹, and kcat/Km 8.28 vs. 7.03 
s⁻¹ µM⁻¹ (data given as coniferyl aldehyde vs. sinapyl aldehyde). 
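
The comparisons between the two enzymes can be retraced from Table 2 with simple arithmetic. The sketch below is illustrative only: it copies the coniferyl aldehyde and coniferyl alcohol rows of Table 2, evaluates the Michaelis-Menten rate law for a chosen substrate concentration, and computes fold differences between BdCAD5 and BdCAD3. The function names are hypothetical and the fit uncertainties are ignored.

```python
# Values copied from Table 2. Km in µM, Vmax in nkat mg^-1,
# efficiency = kcat/Km in s^-1 µM^-1.
TABLE_2 = {
    ("BdCAD3", "coniferyl aldehyde"): {"Km": 33.3, "Vmax": 154.3, "efficiency": 0.26},
    ("BdCAD3", "coniferyl alcohol"): {"Km": 509.1, "Vmax": 611.5, "efficiency": 0.067},
    ("BdCAD5", "coniferyl aldehyde"): {"Km": 3.1, "Vmax": 492.3, "efficiency": 12.87},
    ("BdCAD5", "coniferyl alcohol"): {"Km": 1.44, "Vmax": 131.8, "efficiency": 7.19},
}


def michaelis_menten(substrate_uM, Km, Vmax):
    """Initial rate (in the units of Vmax) at a substrate concentration in µM."""
    return Vmax * substrate_uM / (Km + substrate_uM)


def fold_difference(substrate, key):
    """Ratio BdCAD5 / BdCAD3 for one Table 2 column and one substrate."""
    return TABLE_2[("BdCAD5", substrate)][key] / TABLE_2[("BdCAD3", substrate)][key]


if __name__ == "__main__":
    for substrate in ("coniferyl aldehyde", "coniferyl alcohol"):
        print(substrate)
        print("  efficiency, BdCAD5/BdCAD3:", round(fold_difference(substrate, "efficiency"), 1))
        print("  Km, BdCAD3/BdCAD5:", round(1 / fold_difference(substrate, "Km"), 1))
        print("  Vmax, BdCAD3/BdCAD5:", round(1 / fold_difference(substrate, "Vmax"), 2))
```

By construction, the rate returned by `michaelis_menten` equals half of Vmax when the substrate concentration equals Km, which explains why the very high Km of BdCAD3 for coniferyl alcohol outweighs its higher Vmax at low substrate concentrations.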

Predicted protein structures of BdCAD3 and BdCAD5 

The high similarity between the predicted structures of BdCAD3 
and BdCAD5 is presented in Fig. 6. The structures of the two 
proteins were in general very similar to the overall fold of 
AtCAD5, PotSAD, and other ADHs with known structures. 



The β-sheets β13-β17 (βD-βF in AtCAD5) were intact in both 
BdCAD3 and BdCAD5 and it is likely that the dimer structure 
known from PotSAD and AtCAD5 has been maintained 
in the active form of BdCAD3 and BdCAD5. The main difference 
between the overall fold of BdCAD3 and BdCAD5 was 
an additional helix at the C-terminal end of BdCAD3. The 
structures were composed of the catalytic and the nucleotide-binding 
domains. The catalytic domain contained nine 
β-sheets and six α-helices and the nucleotide-binding domain 




Fig. 6. In silico models of the homo-dimers of BdCAD3 (left) and BdCAD5 (right). Structures were modelled using Modeller (Sali and 
Blundell, 1993). The template for BdCAD3 was PotSAD (1YQD) and for BdCAD5 was AtCAD5 (2CF5). Zinc ions are shown as grey 
spheres and cofactor NADP+ as green sticks. 
Table 3. Comparison of the proposed key amino-acid residues 
found in the predicted active sites of AtCAD5, PotSAD, BdCAD3, 
and BdCAD5 

AtCAD5 position   AtCAD5   BdCAD5   PotSAD   BdCAD3   Notes
49                T        T        S        T        NADP-binding/H-shuttle
52                H        H        H        H        NADP-binding
53                Q        Q        S        I        Substrate-binding
57                D        H        D        E        H-shuttle
58                L        L        W        W        Substrate-binding
60                M        A        F        N        Substrate-binding
70                E        E        E        E        Zn-binding
95                C        V        C        Y        Substrate-binding
119               W        W        L        L        Substrate-binding
192               V        V        L        L        NADP-binding
211               S        S        S        S        NADP-binding
212               S        S        T        S        NADP-binding
213               S        S        S        S        NADP-binding
216               K        K        K        K        NADP-binding
276               V        V        A        A        Substrate-binding
286               P        P        F        Y        Substrate-binding
289               M        M        I        I        Substrate-binding
290               L        L        A        T        Substrate-binding
299               F        F        G        G        Substrate-binding
300               I        I        I        V        Substrate-binding
Ratio to AtCAD5   20/20    17/20    7/20     7/20

Residues were selected based on the published structures of 
AtCAD5 and PotSAD. 



contained six β-sheets and five α-helices. The amino acids in the 
catalytic and the structural Zn sites, respectively, were all fully 
conserved in BdCAD3 and BdCAD5 as seven Cys and one Glu. 
According to the alignment (Fig. 4), the residues forming the 
binding pocket in BdCAD3 were similar to the residues found in 
PotSAD (Table 3). The substrate-binding cavity of BdCAD3 was 
similar to that of PotSAD, whereas that of BdCAD5 was similar to that of AtCAD5. 
The two Zn-binding sites both showed the general patterns 
seen in other CAD enzymes: GHE(X)2G(X)5V for the catalytic 
Zn-binding site and GD(X)10C(X)2C(X)7C for the structural 
Zn-binding site. Furthermore, the NADP(H)-binding site pattern 
G(X)3G(X)2GLGG(X)GH(X)2VK(X)2K(X)2G-(X)VTV(X)S(X)S(X)2K 
was found in all 86 sequences. 
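
Patterns written in this notation can be screened against protein sequences with regular expressions. The following Python sketch is not the procedure used in the study; it assumes that (X)n denotes n arbitrary residues and that a bare (X) denotes one, and it translates the two Zn-binding patterns accordingly. The helper names and the toy sequence are hypothetical.

```python
import re


def motif_to_regex(motif):
    """Translate 'GHE(X)2G(X)5V'-style notation into a regular expression."""
    def expand(match):
        # '(X)n' means n arbitrary residues; a bare '(X)' means exactly one.
        count = match.group(1)
        return "." if count == "" else ".{" + count + "}"
    return re.sub(r"\(X\)(\d*)", expand, motif)


CATALYTIC_ZN = motif_to_regex("GHE(X)2G(X)5V")
STRUCTURAL_ZN = motif_to_regex("GD(X)10C(X)2C(X)7C")


def find_motif(pattern, sequence):
    """Return the 0-based start position of every non-overlapping match."""
    return [m.start() for m in re.finditer(pattern, sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT a real CAD protein: it only embeds
    # one instance of the catalytic Zn-binding pattern after two residues.
    toy = "MK" + "GHEAAGAAAAAV" + "KK"
    print(CATALYTIC_ZN, find_motif(CATALYTIC_ZN, toy))
    print(STRUCTURAL_ZN, find_motif(STRUCTURAL_ZN, toy))
```

A real screen would read the sequences from a FASTA file and report which entries lack a match for any of the patterns.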

Discussion 

The aim of the present study was to identify, isolate, and 
characterize genes coding for the CAD enzyme in B. dis- 
tachyon, the model for temperate grasses. In silico screen- 
ing of the Brachypodium database using known CAD genes 
as queries yielded seven CAD-like genes coding for putative 
CAD enzymes. The amino-acid identity between the BdCAD 
enzymes ranged from 41 to 67%, compared with approximately 
90% for the bona fide CADs across plant species. BdCAD6 was 
excluded from further analysis because its protein sequence 



lacked vital amino acids responsible for the coordination of the 
Zn metal ions. 

Identification of CAD genes in Brachypodium 

Of the identified Brachypodium CADs, BdCAD5 showed high sequence similarity to group I (Fig. 3), the members of which have been biochemically characterized as bona fide CADs (Goffner et al., 1992; Kim et al., 2004). BdCAD5 is thus the most likely candidate gene for lignin biosynthesis in Brachypodium. BdCAD5 was expressed in all tissues tested (Fig. 2) but most highly in roots and stems. Similar expression patterns have been seen for other group I CADs, such as rice OsCAD2 (Tobias and Chow, 2005) and L. perenne LpCAD3 (Lynch et al., 2002), and these two genes also exhibited the highest expression in roots and stem. Wheat TaCAD1 is also highly expressed in the stem but shows only minor expression in roots (Ma, 2010). The similarity of expression patterns between different plant species could indicate similar functions for these genes. 

Localization of CAD enzymes 

Six of the seven CAD genes were devoid of N-terminal signal 
peptides and the encoded proteins were likely cytosolic, which is consistent with 
the immunolocalization studies in sugarcane and 
maize (Ruelland et al., 2003). The signal peptide in BdCAD4 was 
found to be 59 residues long, whereas those in HvCAD8B, 
OsCAD8B and -C, SbCAD4-4 and SbCAD8-1, ZmMTD, and 
PoptrCAD9 varied in length. These organellar CAD-like enzymes were placed 
in the large group III in the phylogenetic tree (Fig. 3), which did 
not provide much information about their function besides a gen- 
eral need for ADH activity in the plastids. BdCAD4 was located 
next to BdCAD3 on chromosome 4, which could be a result of 
a duplication event, where the number of and localization of the 
introns were conserved (Fig. 1). However only BdCAD4, and not 
BdCAD3, had a signal peptide, and the identity between the two 
mature proteins was only 76%. Furthermore, BdCAD4 was only 
weakly expressed and in stem tissue only, whereas BdCAD3 was 
highly expressed in the stem and less in the spike. These differ- 
ences in expression patterns were supported by low similarities 
between the promoter sequences (data not shown). The position 
of BdCAD3 and BdCAD4 on chromosome 4 is syntenic to rice 
chromosome 9 and sorghum chromosome 2 (Vogel et al., 2010). 
In both rice and sorghum, two homologous genes were found, of 
which one has the signal peptide. It is likely that duplication was 
followed by diversification of the two genes including signal- 
peptide and promoter sequences and that they were inherited as a 
unit in order to preserve functionality. 

Characterization of expressed BdCAD enzymes 

After purification, the molecular mass of BdCAD5 was estimated to be 40 and 97 kDa by size exclusion chromatography. The functional unit of AtCAD5 has been shown to be a homo-dimer tightly associated through two two-fold related β-strands (Youn et al., 2006). Taken together with the intact β-sheets β13-β17 (βD-βF in AtCAD5), this indicates that BdCAD3 and BdCAD5 are likely to maintain the dimer structure. Previously characterized CADs have been found to be catalytically active either in a dimeric form, as for AtCAD5 (Youn et al., 2006) and EgCAD2 (Goffner et al., 1992), or in a monomeric form, as isolated from Phaseolus vulgaris (Grima-Pettenati et al., 1994; Hawkins and Boudet, 1994) and from the bacterium Helicobacter pylori (Mee et al., 2005). E. gunnii CAD1 (EgCAD1) has been isolated in an active monomeric form (Goffner et al., 1992). However, as discussed in the introduction, EgCAD1 has been shown to be significantly different from other CAD-like enzymes such as AtCAD5 and BdCAD5. The current data indicate that a monomer-homo-dimer equilibrium exists for BdCAD5. 

The molar absorbance coefficient, ε, of coniferyl aldehyde was determined as a function of pH (Table 1 and Supplementary Fig. S2) and was higher than the values reported by Wyrambik and Grisebach (1975), 15,800 M⁻¹ cm⁻¹ at 340 nm, pH 6.5 and 21,000 M⁻¹ cm⁻¹ at 400 nm, pH 8.8, and than the values reported by Larroy et al. (2002). The reason for this difference could be the purity of the coniferyl aldehyde used. The estimated pKa of coniferyl aldehyde was in good agreement with the value of 7.94 reported by Ragnar et al. (2000). 

The conversion of coniferyl aldehyde to coniferyl alcohol by the expressed and purified BdCAD3 and BdCAD5 was shown to be strongly pH dependent (Fig. 5) with optima at pH 5.2 and 6.2, respectively. The observed difference in optimal pH between BdCAD3 and BdCAD5 could partly be explained by the amino-acid differences at positions 57-58 (Fig. 4, AtCAD5 numbering). BdCAD3 had an EW motif whereas BdCAD5 had an HL motif. In AtCAD5, the residue at position 57 was found to be an Asp, which could be one of the key residues in the catalytic mechanism (Youn et al., 2006; Saathoff et al., 2010). CADs associated with cell-wall lignification and displaying significant catalytic activity towards monolignols have the HL or DL dipeptide motif (Saathoff et al., 2010). However, as wheat CAD1 contains a DL motif and displays a pH optimum of 7 (Ma, 2010), the two residues are not likely to be the sole key residues determining the catalytic pH optimum. 

Even though BdCAD3 and BdCAD5 contained a large number of residue changes compared with AtCAD5 and PotSAD (Table 3), they both remained similar in overall structure (Fig. 6). According to the modelled structures of BdCAD3 and BdCAD5, the overall fold is very similar to that of other CADs, both bona fide and other types. Of the 20 amino acids located in the active site of AtCAD5, 17 were conserved in BdCAD5 and only seven in BdCAD3, none of which were related to the binding of the monolignol moiety (Table 3). The efficiency (kcat/Km) of the reduction of coniferyl aldehyde by BdCAD3 and BdCAD5 differed by a factor of 50 in favour of BdCAD5. The kcat/Km of BdCAD5 is comparable to that of TaCAD1 (Ma, 2010), AtCAD5 (Kim et al., 2004), and PviCAD1 and -2 (Saathoff et al., 2010), which have all been classified as bona fide CADs. The efficiency (kcat/Km) and Km of BdCAD5 in the conversion of coniferyl aldehyde and sinapyl aldehyde were comparable and the same was observed for BdCAD3, but with significantly lower values compared with BdCAD5 (Table 2). The difference in the Km values for BdCAD3 and BdCAD5 with coniferyl aldehyde could indicate a difference in substrate preferences, even though BdCAD3 seemed to have a low Km value for sinapyl aldehyde but still a slow conversion (low kcat) and low efficiency compared with BdCAD5 (much lower kcat/Km). The value of Km for BdCAD5 was lower than observed for AtCAD5/AtCAD4 and TaCAD1 and comparable to PviCAD2, indicating an even more dedicated enzyme in the conversion of monolignols. 
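
The conservation counts for BdCAD5 and BdCAD3 quoted above can be tallied directly from Table 3. The sketch below is illustrative only: each string lists the 20 active-site residues of one protein in the order of the AtCAD5 positions in Table 3, and the function name is hypothetical.

```python
# Active-site residues from Table 3, in AtCAD5 position order:
# 49, 52, 53, 57, 58, 60, 70, 95, 119, 192,
# 211, 212, 213, 216, 276, 286, 289, 290, 299, 300
ACTIVE_SITE = {
    "AtCAD5": "THQDLMECWVSSSKVPMLFI",
    "BdCAD5": "THQHLAEVWVSSSKVPMLFI",
    "BdCAD3": "THIEWNEYLLSSSKAYITGV",
}


def conserved(reference, other):
    """Number of aligned positions carrying the same residue."""
    return sum(a == b for a, b in zip(reference, other))


if __name__ == "__main__":
    reference = ACTIVE_SITE["AtCAD5"]
    for name in ("BdCAD5", "BdCAD3"):
        print(f"{name}: {conserved(reference, ACTIVE_SITE[name])}/{len(reference)}")
```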

When comparing the substrate-binding pocket of BdCAD3 to that of AtCAD5, substitutions in the involved amino acids were generally from larger to smaller residues. Such substitutions would probably result in a change of size, conformation, and hydrophobicity of the binding pocket. The roles of BdCAD3 and this group of CAD enzymes have not yet been elucidated. The substrate-binding site in BdCAD3 was equivalent to the substrate cavity of PotSAD when compared with AtCAD5 (Table 3). However, the catalytic activity of BdCAD3 against sinapyl aldehyde was lower than against coniferyl aldehyde, even though a slightly lower Km was observed. The finding that it is not possible to determine the preferred substrate solely from the composition of the binding pocket (AtCAD5 sequence vs. PotSAD sequence) is in agreement with Anterola and Lewis (2002), who reinterpreted the data of Bomati and Noel (2005). The CAD and CAD-like enzymes have only a subsidiary role and do not entirely determine the final composition of H-, G-, and S-lignin. 

PotSAD has been associated with lignin biosynthesis (Li et al., 2001) and AtCAD7 (ELI3-1) and AtCAD8 (ELI3-2) have been associated with plant defence compounds (Trezzini et al., 1993; Somssich et al., 1996). AtCAD1 has been characterized as a benzyl alcohol dehydrogenase with low catalytic activity against monolignol compounds (Somssich et al., 1996). CADs other than bona fide CADs have additional roles such as plant defence or catalytic activity under specific conditions (Barakat et al., 2009; Saathoff et al., 2010). SbCAD4 has shown poor activity against monolignol substrates (Sattler et al., 2009), similarly to BdCAD3, indicating that these types of CADs have other functions in planta. The substrate-binding site is not the only determinant for understanding the different catalytic potentials and the roles in planta of the different CAD and CAD-like enzymes in Brachypodium. 

In conclusion, biochemical and phylogenetic studies have identified target CAD genes for gene modification. A TILLING strategy will be employed for identifying Brachypodium lines with mutations in these genes, in order to determine their in planta function. In addition, the bioinformatic analysis suggested a chloroplastic location of a single CAD-like enzyme across plant species, with an as-yet unclear function that remains to be elucidated by biochemical analysis. 

Supplementary material 

Supplementary data are available at JXB online. 

Supplementary Fig. S1. Alignment of the seven Brachypodium 
protein sequences. 

Supplementary Fig. S2. pH-dependent UV-visible spectra of 
coniferyl aldehyde. 

Supplementary Fig. S3. SDS-PAGE analysis of the purified 
BdCAD3 and BdCAD5. 

Supplementary Table S1. Primers used in the isolation of CAD 
genes and semi-quantitative reverse-transcription PCR. 
Supplementary Table S2. List of plant proteins used in CAD 
protein phylogenetic analyses. 

Supplementary Table S3. Putative CAD genes in Brachypodium 
distachyon. 